Gaussian Visual-Linguistic Embedding for Zero-Shot Recognition
نویسندگان
چکیده
An exciting outcome of research at the intersection of language and vision is that of zeroshot learning (ZSL). ZSL promises to scale visual recognition by borrowing distributed semantic models learned from linguistic corpora and turning them into visual recognition models. However the popular word-vector DSM embeddings are relatively impoverished in their expressivity as they model each word as a single vector point. In this paper we explore word-distribution embeddings for ZSL. We present a visual-linguistic mapping for ZSL in the case where words and visual categories are both represented by distributions. Experiments show improved results on ZSL benchmarks due to this better exploiting of intra-concept variability in each modality
منابع مشابه
LONG, LIU, SHAO: ATTRIBUTE EMBEDDING WITH VSAR FOR ZERO-SHOT LEARNING 1 Attribute Embedding with Visual-Semantic Ambiguity Removal for Zero-shot Learning
Conventional zero-shot learning (ZSL) methods recognise an unseen instance by projecting its visual features to a semantic space that is shared by both seen and unseen categories. However, we observe that such a one-way paradigm suffers from the visualsemantic ambiguity problem. Namely, the semantic concepts (e.g. attributes) cannot explicitly correspond to visual patterns, and vice versa. Such...
متن کاملZero-Shot Activity Recognition with Verb Attribute Induction
In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs. For example, the verb “salute” has several properties, such as being a light movement, a social act, and short in duration. We use these attributes as the internal mapping between visual and textual representations to reason about a previously unseen action....
متن کاملZero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Network
We propose a novel framework called SemanticsPreserving Adversarial Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test images and their classes are both unseen during training. SP-AEN aims to tackle the inherent problem — semantic loss — in the prevailing family of embedding-based ZSL, where some semantics would be discarded during training if they are nondiscriminati...
متن کاملZero-shot Recognition via Semantic Embeddings and Knowledge Graphs
We consider the problem of zero-shot recognition: learning a visual classifier for a category with zero training examples, just using the word embedding of the category and its relationship to other categories, which visual data are provided. The key to dealing with the unfamiliar or novel category is to transfer knowledge obtained from familiar classes to describe the unfamiliar class. In this...
متن کاملAlternative Semantic Representations for Zero-Shot Human Action Recognition
A proper semantic representation for encoding side information is key to the success of zero-shot learning. In this paper, we explore two alternative semantic representations especially for zero-shot human action recognition: textual descriptions of human actions and deep features extracted from still images relevant to human actions. Such side information are accessible on Web with little cost...
متن کامل